feat(helm): add pod extensibility controls#380

Open
thekkagent wants to merge 7 commits into openabdev:main from thekkagent:feat/helm-chart-extensibility

Conversation

@thekkagent

@thekkagent thekkagent commented Apr 16, 2026

What problem does this solve?

OpenAB's Helm chart covers the minimal install path well, but it is too thin for many real Kubernetes deployments.

It does not currently expose several common deployment controls that operators typically expect from a reusable open-source chart, including:

  • imagePullSecrets
  • probes (livenessProbe, readinessProbe, startupProbe)
  • lifecycle
  • initContainers
  • extraContainers
  • extra volumes / volume mounts
  • serviceAccountName binding
  • chart-managed ServiceAccount with IRSA support
  • Role / ClusterRole RBAC
  • PodDisruptionBudget
  • other pod-level and cluster-level deployment settings that are common in production clusters

This leaves users with a few unattractive options:

  1. rebuild a custom image for relatively minor deployment-specific needs
  2. patch rendered manifests after helm template
  3. fork the chart just to add standard Kubernetes fields

A concrete example: operators who need agents to access AWS resources (via IRSA) or the Kubernetes API currently have to fork the chart to add ServiceAccount annotations and RBAC rules.

Closes: #397

Discord Discussion URL: https://discordapp.com/channels/1491295327620169908/1493841502529523732

At a Glance

┌─────────────────────────────────────┐
│ Current OpenAB Helm Chart           │
│ minimal install path                │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ Missing common chart hooks          │
│ - imagePullSecrets                  │
│ - probes / lifecycle                │
│ - initContainers / extraContainers  │
│ - extra volumes / mounts            │
│ - ServiceAccount + RBAC             │
│ - PodDisruptionBudget               │
│ - polymorphic env (valueFrom)       │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ This PR                             │
│ pod extensibility + SA/RBAC/PDB     │
│ + env + template fixes              │
└─────────────────────────────────────┘

Prior Art & Industry Research

OpenClaw:

I reviewed the local OpenClaw repository and its Kubernetes deployment manifests. While OpenClaw does not currently ship a Helm chart in this repo, it does treat startup bootstrap as a first-class deployment concern. In particular, it uses an initContainer to prepare configuration and workspace state before the main container starts.

This suggests:

  • initContainers are a reasonable place for startup preparation
  • bootstrap logic should be treated as deployment design, not an ad hoc workaround
  • a hardened main container can stay simpler when initialization is separated

Reference:

  • scripts/k8s/manifests/deployment.yaml

Hermes Agent:

I reviewed Hermes Agent's Docker and deployment documentation. Hermes takes a clearer stance on tool installation: stable toolchains should primarily be handled through custom images or clearly defined mutable runtime environments, not by stretching the chart into a package manager.

This suggests:

  • serious toolchains should prefer custom images
  • runtime bootstrap can exist, but it should stay lightweight and bounded
  • Helm should expose deployment extension points, not replace image design

References:

  • Dockerfile
  • website/docs/user-guide/docker.md
  • website/docs/getting-started/nix-setup.md

Other references:

I reviewed Bitnami charts and Helm / Kubernetes best practices. Bitnami's more mature charts commonly expose imagePullSecrets, serviceAccount, probes, lifecycle, initContainers, extraContainers, extraVolumes, extraVolumeMounts, extraEnvVarsCM, extraEnvVarsSecret, extraDeploy, and pdb. This PR adopts the subset of that surface area that fits under Deployment.spec.template plus the essential cluster-level resources (ServiceAccount, Role, ClusterRole, PDB).

Relevant upstream guidance also supports this direction:

  • Helm chart values and template best practices
  • Kubernetes guidance for initContainers
  • Kubernetes guidance for private registry pulls via imagePullSecrets
  • Kubernetes RBAC best practices for namespaced vs cluster-scoped access

Proposed Solution

This PR implements full pod / deployment extensibility plus chart-managed ServiceAccount, RBAC, and PodDisruptionBudget. Discord feedback from @tonylee led to expanding beyond the original Phase 1 scope to cover AWS IRSA + K8s API access scenarios.

Files changed and their purpose:

  • charts/openab/values.yaml: Helm values for pod extensibility, serviceAccount, rbac, and podDisruptionBudget
  • charts/openab/templates/_helpers.tpl: helpers for per-agent image pull policy/secrets, ServiceAccount name resolution (create + external fallback), reserved-key-safe pod metadata merging, and a unified discord.enabled gate
  • charts/openab/templates/deployment.yaml: renders pod-level extensibility controls into Deployment.spec.template
  • charts/openab/templates/configmap.yaml: filters map-typed env from config.toml, escapes string env via toJson, uses the unified discord gate
  • charts/openab/templates/secret.yaml: uses the unified discord gate helper
  • charts/openab/templates/serviceaccount.yaml: per-agent ServiceAccount (opt-in), with annotations + automount control
  • charts/openab/templates/role.yaml: per-agent namespaced Role (opt-in)
  • charts/openab/templates/rolebinding.yaml: per-agent RoleBinding (opt-in)
  • charts/openab/templates/clusterrole.yaml: per-agent ClusterRole (opt-in)
  • charts/openab/templates/clusterrolebinding.yaml: per-agent ClusterRoleBinding (opt-in)
  • charts/openab/templates/pdb.yaml: per-agent PodDisruptionBudget (opt-in)
  • charts/openab/tests/deployment_test.yaml: pod extensibility coverage (17 tests)
  • charts/openab/tests/env_test.yaml: env polymorphic rendering + configmap filtering (5 tests)
  • charts/openab/tests/serviceaccount_test.yaml: ServiceAccount creation coverage (8 tests)
  • charts/openab/tests/rbac_test.yaml: RBAC coverage (9 tests)
  • charts/openab/tests/pdb_test.yaml: PodDisruptionBudget coverage (6 tests)

Implemented scope:

Pod / Deployment extensibility

  • imagePullSecrets (global + per-agent, with explicit opt-out, supports both K8s native and shorthand formats)
  • per-agent imagePullPolicy (global default + per-agent override)
  • per-agent serviceAccountName binding (for referencing externally-created SAs)
  • livenessProbe, readinessProbe, startupProbe, lifecycle
  • initContainers, extraContainers
  • extraVolumes, extraVolumeMounts
  • podAnnotations, podLabels (global + per-agent, merged with reserved-key protection)
  • polymorphic env rendering (string values + valueFrom maps via kindIs "map")
  • TOML env value escaping via toJson (fixes pre-existing injection risk)
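The polymorphic env rendering described above can be sketched as a Go-template branch. This is an illustrative sketch, not the chart's exact code — the context variable ($cfg) and indentation are assumptions:

```yaml
# Hypothetical excerpt of deployment.yaml; names are illustrative.
env:
  {{- range $key, $val := $cfg.env }}
  {{- if kindIs "slice" $val }}
  {{- fail (printf "env %s: list values are not supported" $key) }}
  {{- else if kindIs "map" $val }}
  - name: {{ $key }}
    {{- toYaml $val | nindent 4 }}   {{/* valueFrom blocks render verbatim */}}
  {{- else }}
  - name: {{ $key }}
    value: {{ $val | toJson }}       {{/* toJson quotes and escapes strings */}}
  {{- end }}
  {{- end }}
```

The kindIs checks route each entry: strings become value:, maps become valueFrom-style blocks, and lists fail loudly.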

Chart-managed resources (Phase 2)

  • ServiceAccount (opt-in via serviceAccount.create, with annotations for IRSA and automountServiceAccountToken control)
  • namespaced Role + RoleBinding (opt-in via rbac.create)
  • cluster-scoped ClusterRole + ClusterRoleBinding (opt-in via rbac.createClusterRole)
  • PodDisruptionBudget (opt-in via podDisruptionBudget.enabled, supports both minAvailable and maxUnavailable)
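Taken together, the opt-in switches might look like this in values.yaml (an illustrative fragment; the key names follow the PR description, the IRSA role ARN is an example):

```yaml
serviceAccount:
  create: true
  annotations:
    # IRSA: bind the SA to an AWS IAM role (example ARN)
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/my-agent
  automountServiceAccountToken: true

rbac:
  create: true                # namespaced Role + RoleBinding
  createClusterRole: false    # cluster-scoped resources stay off by default

podDisruptionBudget:
  enabled: true
  minAvailable: 1             # mutually exclusive with maxUnavailable
```

Everything defaults to false/off, so omitting this block reproduces the pre-PR behavior.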

Template consistency

  • unified discord.enabled gate via openab.discordEnabled helper (consistent behavior across configmap, deployment, secret)

Explicitly out of scope:

  • extra ConfigMap / Secret creation — already covered by existing extraVolumes/extraVolumeMounts/envFrom, operators create their own K8s resources and reference them
  • generic extra objects such as extraDeploy — deferred

For the "install tools" question specifically, the implementation keeps two clear paths:

  1. Custom image — the preferred path for stable, repeatable, production-grade toolchains
  2. initContainers + shared volume — a lightweight bootstrap path for small binaries or startup initialization
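The second path can be sketched as a per-agent values fragment (hypothetical names and URL, for illustration only):

```yaml
# Lightweight bootstrap: fetch a small binary into a shared emptyDir
# before the main container starts.
initContainers:
  - name: fetch-tool
    image: busybox:1.36
    command:
      - sh
      - -c
      - wget -O /tools/mytool https://example.com/mytool && chmod +x /tools/mytool
    volumeMounts:
      - name: tools
        mountPath: /tools
extraVolumes:
  - name: tools
    emptyDir: {}
extraVolumeMounts:
  - name: tools
    mountPath: /tools
```

Anything heavier than this belongs in a custom image.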

Safety & correctness

Three design decisions are load-bearing for correctness.

Reserved pod-metadata keys cannot be hijacked. agentPodLabels and agentPodAnnotations use mergeOverwrite(dict, global, per-agent, reserved), where the reserved key set is merged last. This guarantees:

  • checksum/config annotation is always the chart-computed hash — users cannot accidentally pin it or disable rollout-on-config-change
  • app.kubernetes.io/{name,instance,component} labels on the pod template always match spec.selector.matchLabels, so Deployment → Pod selector matching can never be broken by a user-supplied podLabels entry

Without this protection, a user who set podLabels.app.kubernetes.io/name: custom would silently produce duplicate YAML keys and a Deployment whose ReplicaSet could no longer match its own Pods.
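The reserved-last merge can be sketched as a helper (a simplified sketch under assumed helper names — the chart's actual signatures may differ):

```yaml
{{/* Hypothetical _helpers.tpl excerpt: reserved keys are merged LAST,
     so they overwrite any colliding global or per-agent entry. */}}
{{- define "openab.agentPodLabels" -}}
{{- $reserved := fromYaml (include "openab.selectorLabels" .) -}}
{{- $merged := mergeOverwrite (dict)
      (.Values.podLabels | default dict)
      (.cfg.podLabels | default dict)
      $reserved -}}
{{- toYaml $merged -}}
{{- end -}}
```

Because mergeOverwrite applies sources left to right, placing $reserved last guarantees chart-computed labels always win.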

Per-agent imagePullSecrets: [] explicitly opts out of the global list. The helper uses hasKey instead of a truthy check, so three states are distinguishable per agent:

  • key omitted → inherit .Values.imagePullSecrets
  • imagePullSecrets: [] → opt out (no pull secrets, even if global is set)
  • imagePullSecrets: [foo] → replace with per-agent list
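The three states above hinge on hasKey, which distinguishes "key absent" from "key present but empty" — a truthy check would collapse the first two. A sketch (illustrative variable names):

```yaml
{{/* Hypothetical helper logic for per-agent imagePullSecrets. */}}
{{- $secrets := .Values.imagePullSecrets -}}
{{- if hasKey $cfg "imagePullSecrets" -}}
{{- $secrets = $cfg.imagePullSecrets -}}  {{/* [] opts out; [foo] replaces */}}
{{- end -}}
{{- with $secrets }}
imagePullSecrets:
  {{- range . }}
  {{- if kindIs "map" . }}
  - name: {{ .name }}        {{/* K8s native format: - name: regcred */}}
  {{- else }}
  - name: {{ . }}            {{/* shorthand format: - regcred */}}
  {{- end }}
  {{- end }}
{{- end }}
```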

All new resources are opt-in with safe defaults. serviceAccount.create, rbac.create, rbac.createClusterRole, and podDisruptionBudget.enabled all default to false. Existing deployments see no behavioral change on upgrade — only new resources are created when operators explicitly opt in.

Backward compatibility for ServiceAccount binding. The existing agents.<x>.serviceAccountName field still works for referencing externally-created SAs. When serviceAccount.create: true is set, the chart-created SA takes precedence over the external reference.

Why this approach?

I do not think OpenAB should model every Kubernetes field at once, and I do not think Helm should become the primary mechanism for packaging arbitrary tools.

At the same time, the current chart is thin enough that users are pushed toward forking it for fairly normal deployment requirements. That creates unnecessary friction for a public chart.

This approach takes the middle path:

  • keep the default chart simple — all new controls are opt-in
  • expose the extension points that operators commonly expect
  • keep custom images as the primary answer for serious tool installation
  • treat initContainers bootstrap as a lightweight complement, not the main packaging model
  • make chart-managed RBAC available for teams who need agents to access cloud resources or the K8s API, without requiring a fork

Tradeoffs and limitations:

  • the chart surface will grow
  • pod-metadata merge rules must be unambiguous and predictable — addressed by the reserved-last mergeOverwrite pattern
  • PodDisruptionBudget combined with the hardcoded replicas: 1 has a known edge case (minAvailable: 1 prevents node drains) — documented with a comment in values.yaml

Alternatives Considered

1. Pod extensibility only; defer SA/RBAC/PDB — initially chosen, then expanded

The first iteration of this PR scoped to Deployment.spec.template only, deferring SA/RBAC/PDB. After Discord feedback (specifically operators needing IRSA + K8s API access), the scope was expanded to include chart-managed resources since those are the exact gaps that otherwise force a fork.

2. Implement everything at once — chosen

Including SA/RBAC/PDB in the same PR keeps related extensibility changes together. All new resources follow the existing per-agent multi-resource loop pattern and share the same helpers (e.g. agentFullname, selectorLabels). All defaults are opt-in, so the risk profile is no higher than the Phase 1 changes.

3. Extra ConfigMap / Secret creation — rejected

The existing extraVolumes/extraVolumeMounts/envFrom fields already cover this use case. Users create their own ConfigMaps/Secrets (with their preferred lifecycle management) and reference them. Adding chart-managed extras would duplicate functionality without clear benefit.

Validation

  • helm lint charts/openab passes
  • Rendering coverage added for:
    • imagePullSecrets (global + per-agent, both K8s native and shorthand formats)
    • per-agent imagePullPolicy
    • initContainers, extraContainers
    • extraVolumes / extraVolumeMounts
    • probes (livenessProbe, readinessProbe, startupProbe), lifecycle
    • serviceAccountName + chart-managed ServiceAccount
    • ServiceAccount automountServiceAccountToken + IRSA annotations
    • Role + RoleBinding + ClusterRole + ClusterRoleBinding
    • PodDisruptionBudget (minAvailable, maxUnavailable, percentage values)
    • pod annotations and labels (global + per-agent merged)
    • polymorphic env rendering (string, valueFrom, mixed)
  • Hardening coverage added for:
    • reserved selector labels (app.kubernetes.io/{name,component}) cannot be hijacked by user podLabels
    • checksum/config annotation cannot be overridden by user podAnnotations
    • per-agent imagePullSecrets: [] opts out of the global list
    • per-agent podLabels override global for the same non-reserved key
    • list-typed env values rejected with clear error message
    • map-typed env values filtered from config.toml (TOML cannot represent valueFrom)
    • PDB fails loudly when both minAvailable and maxUnavailable are set
    • PDB fails loudly when neither is set
  • Backward compatibility:
    • existing serviceAccountName still resolves to external SA
    • serviceAccount.create: true overrides external reference when both are set
  • Manual deploy-oriented verification on a real cluster

Validation command:

helm unittest charts/openab

Result: 61 passed, 0 failed (across 8 suites)

DC: https://discord.com/channels/1491295327620169908/1493841502529523732/1494148867380219956

@thekkagent thekkagent requested a review from thepagent as a code owner April 16, 2026 01:28
@github-actions github-actions bot added the closing-soon PR missing Discord Discussion URL — will auto-close in 3 days label Apr 16, 2026
@github-actions github-actions bot removed the closing-soon PR missing Discord Discussion URL — will auto-close in 3 days label Apr 16, 2026
@thekkagent
Author

Follow-up: polymorphic env rendering

This came up in Discord feedback — the current env map only supports simple key: "string" pairs, which makes it impossible to inject downward API fields or reference external secrets. For example, there's no way to pass metadata.name as an env var to the agent container.

Added kindIs map type detection so both formats work without breaking existing usage:

env:
  # existing usage — unchanged
  SIMPLE_VAR: "hello"
  # new — valueFrom and other complex types
  MY_POD_NAME:
    valueFrom:
      fieldRef:
        fieldPath: metadata.name

Changes:

  • deployment.yaml: env range uses kindIs "map" to branch between value: and toYaml
  • values.yaml: updated comment examples
  • helm-template-test.sh: added Test 19-21 (string, valueFrom, mixed)

Contributor

@masami-agent masami-agent left a comment


Review — PR #380

Well-structured PR, @thekkagent. The phased approach (pod/deployment extensibility only, deferring PDB/RBAC) is the right call. The reserved-key protection and imagePullSecrets opt-out design are particularly thoughtful.

✅ What looks good

  1. Reserved label/annotation protection: mergeOverwrite with reserved keys merged last prevents selector label hijacking and checksum/config override. This is a real safety concern that many charts get wrong.

  2. imagePullSecrets three-state design — key omitted (inherit global), [] (opt out), [foo] (override). Using hasKey instead of truthy check is correct.

  3. Polymorphic env rendering: the kindIs "map" check allows both MY_VAR: "hello" and MY_VAR: { valueFrom: ... }. Backward compatible with existing string-only env.

  4. Test coverage — 14 new tests (Tests 8-21) covering all new features plus hijack protection. The test harness fix (PASS=$((PASS + 1)) instead of ((PASS++))) for set -e compatibility is a nice catch.

  5. No src/ changes — pure Helm chart, lowest risk to runtime.

  6. Scope discipline — explicitly lists what is out of scope (PDB, RBAC, extraDeploy).

🔴 Must fix before merge

1. sidecars should be extraContainers or sidecarContainers

sidecars: []

The Bitnami convention uses sidecars but Kubernetes 1.28+ has native sidecar support via initContainers with restartPolicy: Always. Using sidecars as a field name that renders into containers: could confuse users who expect native sidecar behavior. Consider renaming to extraContainers to be explicit about where they render (in the containers array, not as native sidecars).

2. Missing Closes # in PR description

The PR description says Closes: N/A. If there is a related issue, link it. If not, consider opening one for traceability.

🟡 Non-blocking

3. values.yaml default agent block has new fields at the bottom

The new per-agent fields (serviceAccountName, podAnnotations, podLabels, probes, lifecycle, initContainers, sidecars, extraVolumes, extraVolumeMounts) are added after resources: {} but before nodeSelector. This is fine, but grouping them with a comment header (e.g. # Pod extensibility) would improve readability.

4. Empty object defaults vs omitted

livenessProbe: {}
readinessProbe: {}
startupProbe: {}
lifecycle: {}

These default to {} in values.yaml. The templates use {{- with $cfg.livenessProbe }} which treats {} as falsy in Helm, so they won't render. This is correct behavior, but a brief comment in values.yaml noting that {} means "disabled" would help operators.
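For illustration, the pattern in question (a sketch of the template shape, not the chart's exact code):

```yaml
# Helm's `with` treats an empty map as falsy, so the {} default
# renders nothing — the probe is effectively disabled.
{{- with $cfg.livenessProbe }}
livenessProbe:
  {{- toYaml . | nindent 2 }}
{{- end }}
```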

5. Test file is getting large (232 new lines)

The test script is now ~320 lines. Not blocking, but consider splitting into separate test files by feature area in a follow-up (e.g. test-extensibility.sh, test-bot-messages.sh).

📝 Summary

Priority | Item | Status
🔴 | Rename sidecars to extraContainers | Needs fix
🔴 | Missing issue link | Needs fix
🟡 | Comment header for new values fields | Non-blocking
🟡 | Document {} means disabled for probes | Non-blocking
🟡 | Test file size | Non-blocking

Solid work overall. The reserved-key protection pattern is something other charts should learn from. Two small fixes and this is ready to approve.

@masami-agent
Contributor

@thekkagent Before we can proceed with approval, three requests:

1. Rename sidecars → extraContainers
K8s 1.28+ introduced native sidecars (initContainers with restartPolicy: Always). Your field renders into the main containers array, not as native sidecars. extraContainers makes this explicit and avoids confusion.

2. Post a rendered manifest
Please run helm template with these values enabled and paste the output:

  • imagePullSecrets
  • initContainers (at least one)
  • livenessProbe
  • extraVolumes / extraVolumeMounts
  • podLabels with a custom key

This lets us verify indentation and empty-value behavior in the final YAML without needing a local Helm install.

3. Confirm reserved key protection works
Run this and paste the result:

helm template test charts/openab/ \
  --set-json 'agents.kiro.podLabels={"app.kubernetes.io/name":"hacked"}' \
  | grep 'app.kubernetes.io/name'

Expected: output should show the chart-computed name (e.g. openab), not hacked. This confirms that user-supplied podLabels cannot break Deployment → Pod selector matching.

@thekkagent
Author

Addressing review feedback + follow-up improvements

Rebased on latest main (v0.7.7-beta.1) and pushed 4 clean commits:

1. feat(helm): add pod extensibility controls

Original PR scope — pod/deployment extensibility. Also fixes Test 1-7 to pass botToken after upstream discord.enabled gate change (#394).

2. feat(helm): support valueFrom and other complex env var types

From Discord feedback — the env map only supported simple key: "string" pairs, making it impossible to inject downward API fields or reference external secrets.

Added kindIs "map" type detection so both formats work:

env:
  SIMPLE_VAR: "hello"
  MY_POD_NAME:
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  • deployment.yaml: map values render via toYaml (e.g. valueFrom blocks)
  • configmap.yaml: map values are filtered out — config.toml only supports string env vars, valueFrom is handled at the pod level by Kubernetes
  • Tests 19-21 added (string, valueFrom, mixed)

3. refactor(helm): rename sidecars to extraContainers, add value comments

  • Renamed sidecars → extraContainers per review — this field renders into the main containers array, not as K8s 1.28+ native sidecars
  • Added # -- Pod extensibility comment header
  • Documented {} = disabled for probes and lifecycle

4. fix(helm): escape TOML env values, reject list env vars, support K8s imagePullSecrets format

  • Fixed pre-existing TOML injection risk where double quotes in env string values produced invalid TOML — now uses toJson for proper escaping
  • Added kindIs "slice" guard in both deployment.yaml and configmap.yaml to reject list-typed env values with a clear error message
  • Derived agentPodLabels reserved keys from selectorLabels helper to prevent future drift between Deployment selector and pod labels
  • imagePullSecrets now supports both K8s native object format (- name: regcred) and shorthand string format (- regcred)
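To make the TOML injection fix concrete, here is the failure mode and the escaped result. The GREETING variable is a hypothetical example; toJson emits a quoted, backslash-escaped literal, which is also a valid TOML basic string:

```toml
# Before (raw interpolation): a value containing a double quote breaks the file
env = { GREETING = "say "hi"" }     # invalid TOML

# After (toJson-escaped): the inner quotes are escaped
env = { GREETING = "say \"hi\"" }   # valid TOML basic string
```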

Re: test file size

Agreed — will split test files by feature area in a follow-up PR.

Rendered manifest verification

helm template with imagePullSecrets, initContainers, livenessProbe, extraVolumes, podLabels, and polymorphic env
---
# Source: openab/templates/secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: test-openab-kiro
  labels:
    helm.sh/chart: openab-0.7.7-beta.1
    app.kubernetes.io/name: openab
    app.kubernetes.io/instance: test
    app.kubernetes.io/component: kiro
    app.kubernetes.io/version: "0.7.7-beta.1"
    app.kubernetes.io/managed-by: Helm
  annotations:
    "helm.sh/resource-policy": keep
type: Opaque
data:
  discord-bot-token: "dGVzdA=="
---
# Source: openab/templates/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: test-openab-kiro
  labels:
    helm.sh/chart: openab-0.7.7-beta.1
    app.kubernetes.io/name: openab
    app.kubernetes.io/instance: test
    app.kubernetes.io/component: kiro
    app.kubernetes.io/version: "0.7.7-beta.1"
    app.kubernetes.io/managed-by: Helm
data:
  config.toml: |
    [discord]
    bot_token = "${DISCORD_BOT_TOKEN}"
    allowed_channels = ["YOUR_CHANNEL_ID"]
    allowed_users = []
    allow_bot_messages = "off"

    [agent]
    command = "kiro-cli"
    args = ["acp","--trust-all-tools"]
    working_dir = "/home/agent"
    env = { SIMPLE_VAR = "hello" }

    [pool]
    max_sessions = 10
    session_ttl_hours = 24

    [reactions]
    enabled = true
    remove_after_reply = false
---
# Source: openab/templates/pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-openab-kiro
  labels:
    helm.sh/chart: openab-0.7.7-beta.1
    app.kubernetes.io/name: openab
    app.kubernetes.io/instance: test
    app.kubernetes.io/component: kiro
    app.kubernetes.io/version: "0.7.7-beta.1"
    app.kubernetes.io/managed-by: Helm
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# Source: openab/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-openab-kiro
  labels:
    helm.sh/chart: openab-0.7.7-beta.1
    app.kubernetes.io/name: openab
    app.kubernetes.io/instance: test
    app.kubernetes.io/component: kiro
    app.kubernetes.io/version: "0.7.7-beta.1"
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app.kubernetes.io/name: openab
      app.kubernetes.io/instance: test
      app.kubernetes.io/component: kiro
  template:
    metadata:
      annotations:
        checksum/config: 3a5a7cd5cc371b23056d7db4bf43ee5bcee71347f0d1297d8b866a9023199911
      labels:
        app.kubernetes.io/component: kiro
        app.kubernetes.io/instance: test
        app.kubernetes.io/name: openab
        team: platform
    spec:
      imagePullSecrets:
        - name: "ghcr-secret"
      securityContext:
        fsGroup: 1000
        runAsGroup: 1000
        runAsNonRoot: true
        runAsUser: 1000
      initContainers:
        - command:
          - sh
          - -c
          - echo init
          image: busybox:1.36
          name: bootstrap
      containers:
        - name: openab
          image: "ghcr.io/openabdev/openab:0.7.7-beta.1"
          imagePullPolicy: IfNotPresent
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
          env:
            - name: DISCORD_BOT_TOKEN
              valueFrom:
                secretKeyRef:
                  name: test-openab-kiro
                  key: discord-bot-token
            - name: HOME
              value: /home/agent
            - name: MY_POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: SIMPLE_VAR
              value: "hello"
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
          volumeMounts:
            - name: config
              mountPath: /etc/openab
              readOnly: true
            - name: data
              mountPath: /home/agent
            - mountPath: /scratch
              name: scratch
      volumes:
        - name: config
          configMap:
            name: test-openab-kiro
        - name: data
          persistentVolumeClaim:
            claimName: test-openab-kiro
        - emptyDir: {}
          name: scratch

Reserved key protection verification

$ helm template test charts/openab/ \
  --set-json 'agents.kiro.podLabels={"app.kubernetes.io/name":"hacked"}' \
  | grep 'app.kubernetes.io/name'

    app.kubernetes.io/name: openab
    app.kubernetes.io/name: openab
    app.kubernetes.io/name: openab
      app.kubernetes.io/name: openab
        app.kubernetes.io/name: openab

User-supplied podLabels cannot override reserved selector labels. All 22 tests pass.

Contributor

@masami-agent masami-agent left a comment


Re-review — PR #380

All three requested items addressed. Verified the latest commits:

✅ Requested changes (resolved)

  1. sidecars → extraContainers — renamed in values.yaml, deployment.yaml, and tests ✓
  2. Rendered manifest posted — indentation correct, imagePullSecrets / initContainers / livenessProbe / extraVolumes / polymorphic env all render properly ✓
  3. Reserved key protection verified — app.kubernetes.io/name stays openab even when the user sets hacked. Protection works ✓

✅ Bonus improvements

  • TOML injection fix — env values with double quotes now properly escaped via toJson
  • kindIs "slice" guard — list-typed env values rejected with clear error
  • imagePullSecrets dual format — supports both K8s native - name: regcred and shorthand - regcred
  • Reserved labels derived from selectorLabels — prevents future drift
  • Rebased on main (v0.7.7-beta.1)
  • 22 tests pass

Summary

High-quality contribution. The reserved-key protection, TOML injection fix, and polymorphic env support are all well-designed. Ready for maintainer approval.

LGTM.

@obrutjack
Collaborator

Thorough work. Reviewed the full diff and the rendered manifest output.

What I like:

  • Reserved key protection derived from selectorLabels helper — prevents drift, smart design
  • TOML injection fix with toJson — caught a pre-existing risk
  • kindIs "slice" guard with clear fail message — good DX
  • imagePullSecrets dual format support — pragmatic
  • 22 tests passing, rendered manifest looks clean

One question before I approve: the configmap.yaml change filters out kindIs "map" env values from config.toml (since TOML can't represent valueFrom). What happens if an operator sets ALL env vars as valueFrom maps? Does config.toml still render a valid [agent] section with no env line, or does it produce env = {}?

thekkagent and others added 6 commits April 17, 2026 10:00
Add pod/deployment extensibility to the Helm chart, focused on
Deployment.spec.template scope: imagePullSecrets (global + per-agent
with opt-out), per-agent imagePullPolicy, serviceAccountName, probes,
lifecycle, initContainers, sidecars, extraVolumes, extraVolumeMounts,
podAnnotations and podLabels with reserved-key protection.

Also fixes Test 1-7 to pass botToken after upstream discord.enabled
gate change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The env map previously only supported simple key-value strings. This
adds polymorphic rendering via kindIs "map" so users can use valueFrom
(fieldRef, secretKeyRef, configMapKeyRef, etc.) while maintaining full
backward compatibility with existing string values.

In deployment.yaml, map-typed values render as toYaml (e.g. valueFrom
blocks). In configmap.yaml, map-typed values are filtered out since
config.toml only supports string env vars — valueFrom is a Kubernetes
concept handled at the pod level.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rename `sidecars` to `extraContainers` to avoid confusion with K8s 1.28+
native sidecar support — this field renders into the main containers
array, not as native sidecars.

Also adds a "Pod extensibility" comment header and documents that `{}`
means disabled for probes and lifecycle fields.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…imagePullSecrets format

- Fix pre-existing TOML injection risk where double quotes in env string
  values produced invalid TOML. Use toJson for proper escaping.
- Add kindIs "slice" guard in both deployment.yaml and configmap.yaml to
  reject list-typed env values with a clear error message.
- Derive agentPodLabels reserved keys from selectorLabels helper to
  prevent future drift between Deployment selector and pod labels.
- Support both K8s native object format ({ name: regcred }) and
  shorthand string format for imagePullSecrets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the deleted helm-template-test.sh with helm unittest YAML files:
- deployment_test.yaml: pod extensibility controls (18 tests)
- env_test.yaml: polymorphic env rendering + configmap filtering (5 tests)

Upstream already migrated to helm unittest (configmap_test.yaml,
adapter-enablement_test.yaml). Total: 36 tests across 5 suites.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… helper

Before: configmap used strict truthy check (`if ($cfg.discord).enabled`)
while deployment and secret used default-true (`ne toString "false"`).
This asymmetry came from upstream fix commits and caused different
behavior for edge cases across templates.

Introduces `openab.discordEnabled` helper following the same pattern
as `openab.agentEnabled` and `openab.persistenceEnabled`. Returns
"false" when discord config is absent or explicitly disabled, "true"
otherwise. All three templates (configmap, deployment, secret) now
use the helper for consistent behavior.

Also strengthens the reserved-label-hijack test to assert both
`app.kubernetes.io/name` and `app.kubernetes.io/component` are
protected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

@masami-agent masami-agent left a comment


Re-review — PR #380

Previous review: APPROVED (after the sidecars → extraContainers rename, rendered manifest verification, and reserved key protection confirmation). This re-review covers the full 4-commit diff against current main.

Summary

  • Problem: Helm chart lacks common deployment controls (imagePullSecrets, probes, lifecycle, initContainers, extraContainers, extra volumes, serviceAccount, pod annotations/labels). Operators are forced to fork the chart for standard K8s deployment needs.
  • Approach: Phase 1 — pod/deployment extensibility only (Deployment.spec.template scope). Defers PDB, RBAC, and generic extra objects.
  • Risk level: Medium (chart surface area grows significantly, but all new fields are opt-in with safe defaults)

Core Assessment

  1. Problem clearly stated: ✅ (linked #397, thorough prior art research)
  2. Approach appropriate: ✅ (phased, focused on Deployment.spec.template only)
  3. Alternatives considered: ✅ (minimum-only rejected, everything-at-once rejected, middle path chosen)
  4. Best approach for now: ✅

Findings

Architecture — well-designed:

  • Reserved key protection via mergeOverwrite(dict, global, per-agent, reserved) with reserved merged last — prevents selector label hijacking and checksum/config override
  • agentPodLabels derives reserved keys from selectorLabels helper — prevents future drift
  • imagePullSecrets three-state design (hasKey check): omitted → inherit global, [] → opt out, [x] → replace
  • Both K8s native format ({name: regcred}) and shorthand string format supported
  • Polymorphic env: kindIs "map" branches between value: and toYaml (valueFrom), kindIs "slice" rejects lists with clear error
  • TOML injection fix: toJson escaping for env string values — catches a pre-existing risk on main
  • configmap.yaml correctly filters out map-typed env values (TOML can't represent valueFrom)
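
The polymorphic env branching described above could be sketched as follows (variable names are illustrative, not the chart's exact code):

```yaml
{{- range $name, $val := $agent.env }}
- name: {{ $name }}
  {{- if kindIs "map" $val }}
  {{- toYaml $val | nindent 2 }}  {{/* e.g. valueFrom: secretKeyRef */}}
  {{- else if kindIs "slice" $val }}
  {{- fail (printf "env var %q must not be a list" $name) }}
  {{- else }}
  value: {{ $val | quote }}
  {{- end }}
{{- end }}
```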

Tests — comprehensive (22 tests):

  • Tests 8-18: pod extensibility (imagePullSecrets, imagePullPolicy, initContainers, extraContainers, volumes, probes, lifecycle, serviceAccount, annotations, labels)
  • Tests 15-18: hardening (reserved label hijack, checksum hijack, imagePullSecrets opt-out, per-agent override)
  • Tests 19-21: env polymorphism (string, valueFrom, mixed)
  • Test harness bug fix: ((PASS++)) → PASS=$((PASS + 1)) for POSIX compatibility ✅
  • Tests 1-7 fixed to pass botToken after upstream discord.enabled gate change ✅

Rendered manifest verified (posted by contributor): indentation correct, imagePullSecrets/initContainers/livenessProbe/extraVolumes/podLabels all render properly.

Reserved key protection verified: app.kubernetes.io/name: hacked → still renders as openab
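
The merge order behind that protection can be sketched as (helper name and value paths taken from the review text; `global.podLabels` and the variable names are assumptions):

```yaml
{{- $reserved := fromYaml (include "openab.selectorLabels" $) -}}
{{- $podLabels := mergeOverwrite (dict) ($.Values.global.podLabels | default dict) ($agent.podLabels | default dict) $reserved -}}
{{- /* reserved keys are merged last, so user-supplied values can never override them */ -}}
```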

Review Summary

🔴 Blockers

  • Branch is 29 commits behind main — must rebase before merge.

💬 Questions

  • obrutjack's unanswered question: What happens when ALL env vars are valueFrom maps? I can answer this from reading the diff: $stringEnv will be empty, the {{- if $stringEnv }} guard skips the env = { } line entirely, and config.toml renders a valid [agent] section without an env field. OpenAB's Rust config uses #[serde(default)] for env, so this is correct — no env line = empty HashMap. @thekkagent please confirm.

🔧 Suggested Changes

  • None — all previous review items have been addressed.

ℹ️ Info

  • This is the largest PR in the review queue (+389/-10), but the scope is cohesive (all Deployment.spec.template).
  • The test file is growing (now 22 tests in one shell script). Contributor acknowledged this and plans to split by feature area in a follow-up PR.
  • values.yaml defaults are all safe: probes/lifecycle default to {} (disabled), initContainers/extraContainers/extraVolumes default to [] (empty), imagePullSecrets default to [] (none).

⚪ Nits

  • None

Verdict

APPROVE (pending rebase). Code quality is high, safety design is thoughtful, test coverage is thorough. The only blocker is the 29-commit rebase against main.

Collaborator

@obrutjack obrutjack left a comment

@thekkagent Following up on my question — I checked the configmap template and config.rs myself. When all env vars are valueFrom maps, $stringEnv is empty and the env = { ... } line is omitted entirely from config.toml. The env field in Rust has #[serde(default)], so missing env defaults to empty HashMap. No parse error. Correct behavior.

Splitting PDB/RBAC into a follow-up PR is the right call.

LGTM — approving.

@thekkagent thekkagent force-pushed the feat/helm-chart-extensibility branch from 0e70a04 to 8dad081 on April 17, 2026 at 08:22
@thekkagent
Author

thekkagent commented Apr 17, 2026

⚠️ Scope expanded + rebased — fresh re-review needed

Heads up @masami-agent @obrutjack — your earlier approvals were on a smaller scope. This push expands the PR significantly. Happy to split into two PRs if the maintainers prefer, but the scope is coherent so I've kept it together.

1. Rebased on latest main

Resolves the 29-commit gap. Handled upstream's helm-template-test.sh → helm-unittest migration, discord.enabled default-true change, and Slack adapter additions without conflicts.

2. Test framework migration

Upstream removed helm-template-test.sh and switched to helm-unittest YAML (configmap_test.yaml, adapter-enablement_test.yaml + chart-test.yml GHA workflow). I followed suit:

  • tests/deployment_test.yaml — 17 tests (pod extensibility)
  • tests/env_test.yaml — 5 tests (polymorphic env + configmap filtering)
  • tests/serviceaccount_test.yaml — 8 tests (new)
  • tests/rbac_test.yaml — 9 tests (new)
  • tests/pdb_test.yaml — 6 tests (new)

Total: 61 tests pass across 8 suites.

3. Unified discord.enabled gate

Noticed an asymmetry after rebase — configmap.yaml used strict truthy check while deployment.yaml and secret.yaml used default-true. Introduced openab.discordEnabled helper (same pattern as agentEnabled, persistenceEnabled) applied to all three templates.

4. 🆕 Scope expansion: chart-managed ServiceAccount + RBAC + PDB

This is the meaningful scope change. Discord feedback from @tonylee (paraphrased: "my agents need to access AWS resources and the K8s cluster, so ServiceAccount + Role setup is required") made it clear that pod extensibility alone still forces users to fork for the most common production scenarios.

Given that the RBAC/PDB work follows the same per-agent multi-resource loop pattern and uses the same helpers (agentFullname, selectorLabels), bundling them was lower cognitive overhead than splitting. All new resources are opt-in with safe defaults (serviceAccount.create, rbac.create, rbac.createClusterRole, podDisruptionBudget.enabled all default to false).

Added templates:

  • serviceaccount.yaml — with annotations (IRSA-ready) and automountServiceAccountToken control
  • role.yaml + rolebinding.yaml — namespaced RBAC
  • clusterrole.yaml + clusterrolebinding.yaml — cluster-scope RBAC (for K8s API access across namespaces)
  • pdb.yaml — supports both minAvailable and maxUnavailable, with fail-loud validation

Backward compatibility: the existing agents.<x>.serviceAccountName field still works for referencing externally-created SAs. When serviceAccount.create: true is set, the chart-created SA takes precedence.
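
For example, an IRSA-style setup might look like this (role ARN and key names are illustrative, not the chart's documented schema):

```yaml
serviceAccount:
  create: true
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/openab-agent  # example ARN
  automountServiceAccountToken: true
```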

Known caveat: PDB with minAvailable: 1 combined with the hardcoded replicas: 1 will prevent node drains. Documented in values.yaml with a ⚠️ comment recommending maxUnavailable: 1 for drain support.
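
The drain-friendly configuration that values.yaml recommends would look roughly like:

```yaml
podDisruptionBudget:
  enabled: true
  maxUnavailable: 1   # with the hardcoded replicas: 1, minAvailable: 1 would block node drains
```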

5. Re: @obrutjack's question on all-valueFrom env

Confirmed. When every env var is a valueFrom map, $stringEnv is empty, the {{- if $stringEnv }} guard skips the env = {} line entirely, and config.toml renders a valid [agent] section without an env field. Rust config uses #[serde(default)] for env, so missing = empty HashMap. No parse error.
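
A sketch of the rendered config.toml in that case (everything except the omitted env field is a placeholder):

```toml
[agent]
# ...other agent fields render as usual...
# no `env = { ... }` line is emitted; #[serde(default)] on the Rust side
# turns the missing field into an empty HashMap
```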

6. Also added (minor)

  • /component assertion in the reserved-label-hijack test (previously only /name was asserted)

Commits

e657e86 feat(helm): add ServiceAccount, RBAC, and PodDisruptionBudget templates
8dad081 refactor(helm): unify discord.enabled check via openab.discordEnabled helper
9e1e5aa test(helm): migrate bash tests to helm unittest YAML format
f6ad77c fix(helm): escape TOML env values, reject list env vars, support K8s imagePullSecrets format
c0bd68f refactor(helm): rename sidecars to extraContainers, add value comments
58a0b73 feat(helm): support valueFrom and other complex env var types
deb71f1 feat(helm): add pod extensibility controls

Ready for re-review. If the scope is too much for one PR, I can split commit e657e86 out into a follow-up.

Phase 2 of helm chart extensibility (openabdev#397):

- ServiceAccount (optional, opt-in via serviceAccount.create)
  - Supports custom name, annotations (e.g. IRSA), and automountServiceAccountToken
  - Backward compat: existing serviceAccountName still works for external SAs
- Role + RoleBinding (optional, opt-in via rbac.create)
- ClusterRole + ClusterRoleBinding (optional, opt-in via rbac.createClusterRole)
- PodDisruptionBudget (optional, opt-in via podDisruptionBudget.enabled)
  - Supports both minAvailable and maxUnavailable (mutually exclusive)
  - values.yaml warns against minAvailable: 1 with the hardcoded replicas: 1

All resources are per-agent and follow the existing multi-agent loop pattern.
All defaults are opt-in (false) to preserve existing behavior.
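
For reference, those opt-in defaults as a values.yaml sketch:

```yaml
serviceAccount:
  create: false        # existing serviceAccountName references keep working
rbac:
  create: false
  createClusterRole: false
podDisruptionBudget:
  enabled: false
```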

25 new tests added across 3 suites (serviceaccount, rbac, pdb).
Total: 61 tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@chaodu-agent
Collaborator

Hi @thekkagent — heads up on a direction change.

Per offline discussion among maintainers, the plan is to move the Helm charts out of this main repo and into a dedicated repository (e.g. openabdev/charts). The rationale is to decouple chart release cycles from the core application, which is a common pattern for projects with growing chart complexity.

This PR has solid work — the extensibility design, reserved-key protection, polymorphic env, RBAC/PDB templates, and test coverage are all well done. None of that effort is wasted.

However, since the charts are moving to a new repo, it would make more sense to land this PR there instead of updating the existing charts/ directory here. That way:

  1. The new repo starts with the full extensibility surface from day one
  2. We avoid a large merge into main that will be moved out shortly after
  3. Chart CI/CD (lint, unittest, release) can be set up independently in the new repo

We will share the new repo details once it is created. In the meantime, this PR can stay open for reference, or you can re-target it to the new repo when it is ready.

Thanks for the thorough work on this — looking forward to landing it in the right place.

Successfully merging this pull request may close these issues.

Add pod/deployment extensibility controls to Helm chart
